Introduction

What is this

This notebook will include informal meta-analyses of different metrics and methods for evaluating surgical skill.

The reported metrics compare differences between novices and expert surgeons.

It is informal because it is not based on a systematic review, and because some studies have been included under very relaxed conditions. For example, I have picked the novices and experts without comparing their definitions between studies. Novice = weakest skill group in the study, expert = strongest skill group in the study. If a study included more than 2 groups, I picked the results of the weakest (=novice) and strongest (=expert) groups and discarded the others. If a study included more than 1 task, or several sub-tasks, I picked the one with the largest difference between groups.

Many papers did not report means and standard deviations explicitly, so they had to be estimated from boxplots/barplots, or by some other means.

For example, some studies reported only a mean or median, but no SE/SD. In those cases I estimated the SD/SE based on, e.g., the SD of another similar metric reported in the same study, or the SD of previous results for the same metric. See the Excel file for notes on each study.

This may or may not be turned into a more systematic meta-analysis later.

Example metrics that will most likely be included (bolded ones have priority):

  • Task time
  • Tool Path length
  • Tool Velocity
  • Tool Acceleration
  • Tool Curvature
  • Idle time
  • Pupil dilations
  • Blinks
  • Tool Movement efficiency
  • Number of movements
  • Tool Forces
  • Tool Torques
  • Bimanual dexterity
  • Jerk
  • Fixation duration
  • Saccade amplitudes
  • EEG?
  • Surgical Evaluation Instruments (SEI)

Full list of papers and metrics can be found in the excel file shared in the repo:

Link to Github repo

Last update: 19.7.2022: Added more results. Changed Laparoscopy -> Endoscopy, so all endoscopic procedures are now labeled ‘endoscopy’.

Submit results

If you notice errors or know some good studies to be included, feel free to forward them to

jani.koskinen [ at ] uef.fi

or use the form below (TBD).

How results are calculated

  1. From each study, extract
  • Number of trials per group (Nn, Ne, for novices and experts, respectively)
  • Means per group (Mn, Me for novices and experts, respectively)
  • Standard deviations per group (SDn, SDe)
  2. Calculate the pooled standard deviation SDpooled
  3. Normalize by calculating the Standardized Mean Difference (SMD): (Mn - Me)/SDpooled
  4. Calculate the small sample size correction g = SMD*(1 - 3/(4n - 9)), where n is the total sample size of the study (both groups combined).
  5. Calculate SDg, the standard deviation after correction
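The steps above can be sketched in a few lines of Python. This is only an illustration, not the code used for this notebook. The function name hedges_g and the input numbers are hypothetical. The pooled SD formula (weighted by degrees of freedom) and the standard-error formula are the common textbook versions and are assumptions here, since the exact formulas are not spelled out above.

```python
import math

def hedges_g(n_n, n_e, m_n, m_e, sd_n, sd_e):
    """Return (g, se_g) for a novice group vs. an expert group.

    Hypothetical helper: the pooled SD and standard-error formulas
    are the usual textbook ones, assumed rather than taken from the source.
    """
    # Step 2: pooled standard deviation, weighted by degrees of freedom (assumption)
    sd_pooled = math.sqrt(
        ((n_n - 1) * sd_n ** 2 + (n_e - 1) * sd_e ** 2) / (n_n + n_e - 2)
    )
    # Step 3: standardized mean difference (novice minus expert)
    smd = (m_n - m_e) / sd_pooled
    # Step 4: small sample size correction, n = total sample size
    n = n_n + n_e
    correction = 1 - 3 / (4 * n - 9)
    g = smd * correction
    # Step 5: standard error of the SMD, scaled by the same correction (assumption)
    se_smd = math.sqrt(n / (n_n * n_e) + smd ** 2 / (2 * n))
    se_g = correction * se_smd
    return g, se_g

# Made-up example values, e.g. task time in seconds
g, se_g = hedges_g(n_n=10, n_e=10, m_n=120.0, m_e=90.0, sd_n=30.0, sd_e=20.0)
print(f"g = {g:.3f}, SE = {se_g:.3f}")
```

With this sign convention, a positive g means the novice mean is larger than the expert mean (for task time: novices are slower).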

These values are used as input in the R meta package’s metagen function.
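To give a rough idea of what happens to per-study effect sizes and their standard errors during pooling, here is a minimal sketch of common-effect (fixed-effect) inverse-variance pooling. It is not the meta package's implementation, and metagen can also fit random-effects models, which this sketch leaves out. The function name pool_fixed_effect and the input values are hypothetical.

```python
import math

def pool_fixed_effect(effects, std_errors):
    """Inverse-variance pooling of per-study effect sizes (common-effect model)."""
    # Each study is weighted by the inverse of its variance
    weights = [1 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * g for w, g in zip(weights, effects)) / total_weight
    se_pooled = math.sqrt(1 / total_weight)
    # Approximate 95% confidence interval using the normal quantile 1.96
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return pooled, se_pooled, ci

# Made-up effect sizes and standard errors for three studies
pooled, se_pooled, ci = pool_fixed_effect([1.1, 0.6, 0.9], [0.45, 0.30, 0.50])
print(f"pooled g = {pooled:.2f}, 95% CI [{ci[0]:.2f}, {ci[1]:.2f}]")
```

Studies with smaller standard errors get larger weights, so precise studies pull the pooled estimate towards their own effect size.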

For more information, check:

Doing Meta-Analysis with R: A Hands-On Guide

Forest plots

Forest plot explanation

Summary of Included Studies

Some general statistics of the studies included:

Number of unique studies: 88

Number of studies by surgical technique:

Technique Count
Endoscopy 44
Microsurgery 14
Open Surgery 12
Radiography 1
Robotic Surgery 8

Number of studies by metric:

Metric Count
task_time 35
tool_path_length 24
tool_velocity 16
tool_idle 8
tool_movements 16
tool_jerk 14
tool_acceleration 8
tool_bimanual 7
pupil_dilation 7
tool_force 12
scale_OSATS 9

Sample size estimation (Work in progress)

How many samples are needed to detect a given effect size d? The estimates assume alpha = 0.05, power = 0.8, and a t-test, with independent trials (e.g. no repeated measurements from the same participants).

Hover the mouse over the points in the plot to see the values. The sample size is per group, so you need this many samples in each group.
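As a rough cross-check of the per-group sample sizes, the calculation can be approximated with the Python standard library alone. This sketch uses the normal approximation to a two-sided, two-sample t-test, so it typically gives one or two samples fewer per group than an exact t-test power calculation. The two-sided test and the function name n_per_group are assumptions, not taken from the analysis itself.

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.8):
    """Approximate samples per group needed to detect effect size d.

    Normal approximation to a two-sided, two-sample t-test (assumption).
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = standard_normal.inv_cdf(power)          # quantile for the target power
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

# Illustrative effect sizes only, not results from the meta-analyses
for d in (0.5, 0.8, 1.0, 1.5):
    print(f"d = {d}: about {n_per_group(d)} samples per group")
```

The required sample size scales with 1/d², so halving the effect size roughly quadruples the number of samples needed per group.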

Some effect sizes from the meta-analyses are given as baselines:

IT = Idle Time

TT = Task Time

BD = Bimanual Dexterity

TEPR = Task-Evoked Pupil Reaction/Dilation (estimated with one outlier study removed)

TJ = Tool Jerk

TF = Tool Force